Efficacy of a computer based discontinuation strategy to reduce PPI prescriptions: a multicenter cluster-randomized controlled trial

Deprescribing of inappropriate long-term proton pump inhibitors (PPI) is challenging and there is a lack of useful methods for general practitioners to tackle this. The objective of this randomized controlled trial was to evaluate the effectiveness of the electronic decision aid tool arriba-PPI on reduction of long-term PPI intake. Participants (64.5 ± 12.9 years; 54.4% women) with a PPI intake of at least 6 months were randomized to receive either consultation with arriba-PPI from their general practitioner (n = 1256) or treatment as usual (n = 1131). PPI prescriptions were monitored 6 months before, 6 and 12 months after study initiation. In 49.2% of the consultations with arriba-PPI, the general practitioners and their patients made the decision to reduce or discontinue PPI intake. At 6 months, there was a significant reduction by 22.3% (95% CI 18.55 to 25.98; p < 0.0001) of defined daily doses (DDD) of PPI. A reduction of 3.3% (95% CI − 7.18 to + 0.62) was observed in the control group. At 12 months, the reduction of DDD-PPI remained stable in intervention patients (+ 3.5%, 95% CI − 0.99 to + 8.03), whereas control patients showed a reduction of DDD-PPI (− 10.2%, 95% CI − 6.01 to − 14.33). Consultation with arriba-PPI led to reduced prescription rates of PPI in primary care practices. Arriba-PPI can be a helpful tool for general practitioners to start a conversation with their patients about risks of long-term PPI intake, reduction or deprescribing unnecessary PPI medication.


Participants
Inclusion criteria for general practitioners (GPs) were German as the main language in patient care, technical requirements for using a computer-based tool in the consultation room, willingness to share PPI prescription data from their practice software and consent to the collection of PPI prescription data by the German health insurance company AOK. Excluded were physicians with a specialized focus without regular PPI prescriptions (e.g. psychological therapy or acupuncture practices) or without a computing management system.
Inclusion criteria for patients comprised age ≥ 18 years and regular PPI prescription for at least 6 months. The definition of long-term use varies widely in the literature (most common are ≥ 8 weeks up to 1 year), depending on outcome measurements of side effects or initial diagnosis, among others 25,26. In the context of our study, we define an intake as "long-term" if it lasts ≥ 6 months. According to the guideline, after this time the healing periods for the most common indications should be completed and a reassessment should have taken place 27.
Not included were patients with poor German language skills, patients with cognitive impairments that hinder study information or consent, and housebound patients (e.g. patients with frailties that impede a personal visit to the physician's practice).

Complex intervention
The intervention for the patients was delivered at the level of general practice and was prepared by members of the universities' study teams as follows. After randomization, study team members visited each practice in the intervention group to install the arriba-PPI software on a computer in the treatment room. The GP was then trained in discontinuation strategies, shared decision making and the use of arriba-PPI with a 10 min video in German (https://arriba-hausarzt.de/module/ppi-protonenpumpen-hemmer-absetzen). The day of training in the intervention group was defined as T0. Within the following 6 weeks, study patients were scheduled for an appointment with their GP to receive consultation with arriba-PPI. For further care and until the end of the study, the GPs were instructed to treat their study patients as necessary and as usual for 12 months.

Control
After randomization, control practices were directed by phone to treat their study patients as usual for 12 months. These practices did not make any extra appointments with their patients for this study. The day of the phone call in the control group was defined as T0.

Telephone interview
All patients of the intervention and control group were interviewed by telephone at T1 to gain information about their current PPI medication and the reasons for taking it.

Data collection
Baseline characteristics of GPs and patients were collected with questionnaires and recruitment lists provided during study personnel visits in the practices. PPI prescription data (agent, dose, package size, prescription date) were obtained retrospectively from the practice software in each general practice for a time span of 6 months before T0 to T2 (1.5 years). These data were collected during visits of study personnel in each practice or after instructions by phone and delivery by mail, e-mail or fax, depending on the preferences of practice personnel.
Collected PPI data were analyzed according to Germany's WIdO ATC/DDD methodology, which defines the medication dose typically used per day in the main indication in adults 28. The DDD is a measure of the amount of drug prescribed. It is calculated as follows (using pantoprazole as an example): the number of tablets prescribed (e.g. 100 tablets) is multiplied by the prescribed dose of the respective tablets (e.g. 40 mg) and divided by the dose typically used per day in the main indication in adults (for pantoprazole, this corresponds to 20 mg), which results in 200 DDD. According to this procedure, DDD were calculated for each patient for a time span of 6 months at T0, T1 and T2.
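The DDD arithmetic described above can be sketched in a few lines of Python. This is an illustration only; the function and parameter names are hypothetical, and the authors' own analysis was carried out in R.

```python
def defined_daily_doses(units: int, strength_mg: float, ddd_mg: float) -> float:
    """Return the number of defined daily doses (DDD) in one prescription.

    units       -- number of tablets prescribed
    strength_mg -- dose per tablet in mg
    ddd_mg      -- reference daily dose in mg for the main indication in adults
    """
    if ddd_mg <= 0:
        raise ValueError("ddd_mg must be positive")
    return units * strength_mg / ddd_mg


# Pantoprazole example from the text: 100 tablets of 40 mg, reference dose 20 mg.
pantoprazole_ddd = defined_daily_doses(units=100, strength_mg=40, ddd_mg=20)
print(pantoprazole_ddd)  # 200.0
```

Summing this quantity over all prescriptions of a patient within a 6-month window would give the cumulated DDD used as the outcome.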

Outcomes
The primary outcome was the comparison of cumulated defined daily doses (DDD) of PPI per study patient after 6 months in the intervention and control group (T1). Secondary outcomes were the cumulated DDDs of PPI per study patient after 12 months (T2) and PPI intake status over time.

Sample size
Sample size calculation was described in Rieckert et al. 24, but was changed due to lower recruitment success. Preliminary results led to the assumption that a 20% reduction in DDD PPI compared to control could be achieved (instead of the 15% reduction originally hypothesized). Assuming an ICC of 0.07 and a significance level of 0.05, a power of 80% can be achieved with 94 practices of 15 patients each (instead of an ICC of 0.1 and 204 practices).
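To show how the ICC enters such a calculation, the sketch below applies the standard design effect for cluster-randomized trials with equal cluster sizes, 1 + (m − 1) × ICC, to the revised assumptions. The source does not state which formula or software was used for the sample size calculation, so this is an assumption for illustration, not a reconstruction of the authors' computation.

```python
def design_effect(cluster_size: float, icc: float) -> float:
    """Variance inflation due to clustering, assuming equal cluster sizes."""
    return 1 + (cluster_size - 1) * icc


practices = 94
patients_per_practice = 15
icc = 0.07

total_patients = practices * patients_per_practice
deff = design_effect(patients_per_practice, icc)
# Number of independent patients that would carry the same information.
effective_n = total_patients / deff

print(total_patients, round(deff, 2), round(effective_n))  # 1410 1.98 712
```

Under these assumptions, the 1410 clustered patients carry roughly the information of 712 independent patients, which is why a lower ICC reduces the number of practices required.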

Randomization
The GP practice was regarded as the unit of randomization. A simple randomization scheme was generated by the random package of the programme R. An independent trusted person outside the study team at the universities assigned practices (and their patients) to the intervention or control group according to the randomization sequence. To assure concealment of allocation, no patient was included once recruitment was completed and randomization was performed.
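A simple (unrestricted) randomization at practice level can be sketched as follows. The study used the random package in R; the Python code, the seed and the practice identifiers below are hypothetical and serve only to illustrate the principle.

```python
import random
from collections import Counter


def simple_randomization(practice_ids, seed):
    """Assign each practice independently to one arm with probability 0.5."""
    rng = random.Random(seed)  # fixed seed makes the sequence reproducible
    arms = ("intervention", "control")
    return {pid: rng.choice(arms) for pid in practice_ids}


# Hypothetical identifiers 1..143, matching the number of study practices.
allocation = simple_randomization(range(1, 144), seed=2023)
print(Counter(allocation.values()))
```

Because every practice is assigned independently, simple randomization does not guarantee equally sized arms, unlike block randomization.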

Blinding
Due to the nature of the intervention, neither GPs nor patients were blinded. For practical reasons, study personnel were not blinded either. However, a blinded statistician conducted all analyses.

Statistical methods
Categorical variables regarding demographic characteristics of the general practices and participating patients were analysed using the chi-square test and the corresponding effect size Cramér's V. Values < 0.20 signal a small effect, values between 0.21 and 0.39 a moderate effect, and values > 0.40 a strong effect 29. If more than 25% of cells in a contingency table had expected frequencies below 5, Fisher's exact test was used. For metric data the t-test was performed. Cohen's d was used as an effect size, with a value of 0.2 to 0.49 representing a small effect, a value of 0.5 to 0.79 representing a medium effect, and a value of 0.8 and higher representing a large effect 29.
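The two effect sizes can be computed as in the following sketch. The contingency table and group statistics are made-up numbers for illustration, and the handling of values exactly at 0.20 and 0.40 is an assumption, since the thresholds quoted above leave these boundaries unassigned.

```python
from math import sqrt


def cramers_v(table):
    """Cramér's V for an r x c contingency table given as a list of rows."""
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    k = min(len(row_totals), len(col_totals))
    return sqrt(chi2 / (n * (k - 1)))


def label_cramers_v(v):
    """Verbal label; treatment of exactly 0.20 and 0.40 is an assumption."""
    if v < 0.20:
        return "small"
    if v < 0.40:
        return "moderate"
    return "strong"


def cohens_d(mean_a, mean_b, sd_a, sd_b, n_a, n_b):
    """Cohen's d using the pooled standard deviation of two groups."""
    pooled_var = ((n_a - 1) * sd_a**2 + (n_b - 1) * sd_b**2) / (n_a + n_b - 2)
    return (mean_a - mean_b) / sqrt(pooled_var)


# Hypothetical 2 x 2 table (rows: study arm, columns: practice type).
v = cramers_v([[10, 20], [20, 10]])
print(round(v, 3), label_cramers_v(v))

# Hypothetical group statistics.
print(cohens_d(mean_a=10, mean_b=8, sd_a=4, sd_b=4, n_a=50, n_b=50))
```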

Multilevel multiple imputation
For the main outcome variables of the sum of DDD of PPI, there were 0.7% missing values at time T0, 2.8% missing values at time T1, and 10.8% missing values at time T2, with these occurring exclusively in the intervention group at time T0 and with a similar distribution between the two groups at the other time points. Multilevel multiple imputation was performed using the R package mice 3.11.0 and the method 2l.norm, which uses the linear mixed model with heterogeneous error variance 30. The model included the additional variables age, gender, and number of prescriptions at T0, T1 and T2. For the latter, missing data were replaced as well. The same proportions of missing values existed as for the DDD-PPI sum variables. Inclusion of additional variables did not result in convergence of the model. Twenty data sets each were imputed separately for the intervention and control groups, and the resulting objects were merged for subsequent analyses. The robustness of the results was tested by a complete case analysis 31,32.
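Estimates from multiply imputed data sets are commonly combined with Rubin's rules. The source does not describe the pooling step explicitly, so the sketch below is an assumption about a typical workflow rather than the authors' procedure; the estimates and variances are made-up numbers, not study data.

```python
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Pool one parameter across m imputed data sets with Rubin's rules.

    estimates -- point estimate from each imputed data set
    variances -- squared standard error from each imputed data set
    Returns the pooled estimate and its total variance.
    """
    m = len(estimates)
    pooled = mean(estimates)
    within = mean(variances)          # average sampling variance
    between = variance(estimates)     # variance between imputations (divisor m - 1)
    total = within + (1 + 1 / m) * between
    return pooled, total


# Hypothetical estimates from three imputed data sets.
pooled_estimate, total_variance = pool_rubin([1.0, 2.0, 3.0], [1.0, 1.0, 1.0])
print(pooled_estimate, round(total_variance, 3))
```

The between-imputation term widens the standard error to reflect the uncertainty introduced by the missing values.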

Multilevel modeling
The primary outcome was evaluated using multilevel analyses within the programme package R 33. These take into account the clustering of patients in practices and allow for different modeling with respect to predictors, e.g. group membership as a fixed and/or random effect. An intention-to-treat analysis was applied.
Norman advocates an ANCOVA design in which correction is made for the baseline value and regression is performed on the outcome. This would also take into account regression to the mean, since individuals with extreme values at the beginning of a study would tend to spontaneously converge to the mean of the respective sample 34. The use of percentage change values related to a baseline is criticized in several publications. It is stated that the calculation of percent change values is statistically inefficient, and it is argued that a covariance analysis approach (ANCOVA) with the baseline value as covariate should be used instead. Various simulations were calculated for different correlations between baseline and follow-up scores, and it was concluded that the percent change value had poor statistical efficiency for all correlations 35-37. Adjusting for a baseline covariate can further improve the power of the comparisons and reduce the intra-class correlation coefficient, which will improve the power 38. We therefore fitted a multilevel model at T1 with cumulated defined daily doses (DDD) of PPI at T1 as the dependent variable, group as a predictor variable and cumulated DDD of PPI at T0 as a covariate. A multilevel model at T2 with cumulated DDD of PPI at T2 as the dependent variable, group as a predictor variable and cumulated DDD of PPI at T0 and T1 as covariates was also fitted. Restricted maximum likelihood (REML) was used as the estimation method. Adjusted means for the intervention and control groups were calculated, and percentage reductions in the prescription of DDD of PPI were derived from them.
We also calculated a longitudinal random intercepts model predicting cumulated DDD of PPI over time and using group membership as a predictor. We considered results with p values ≤ 0.05 to be significant. All analyses were performed with R version 4.0.2 and the packages mice, broom.mixed, lme4, lmerTest, lmtest, mitml, emmeans, and mosaic.
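The multilevel ANCOVA model at T1 can be written compactly as below, for patient i in practice j. This is a sketch: the source states that patients are clustered in practices but does not spell out the random-effects structure, so the practice-level random intercept u_j and the normality assumptions are assumptions made here for illustration.

```latex
\begin{aligned}
\mathrm{DDD}^{T1}_{ij} &= \beta_0 + \beta_1\,\mathrm{group}_j
  + \beta_2\,\mathrm{DDD}^{T0}_{ij} + u_j + \varepsilon_{ij},\\
u_j &\sim \mathcal{N}(0,\sigma_u^2), \qquad
\varepsilon_{ij} \sim \mathcal{N}(0,\sigma_\varepsilon^2),\\
\mathrm{ICC} &= \frac{\sigma_u^2}{\sigma_u^2 + \sigma_\varepsilon^2}.
\end{aligned}
```

Here β1 is the baseline-adjusted group difference in cumulated DDD at T1. The T2 model would add the T1 value as a further covariate in the same way.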

Patient and public involvement
No patients or members of the public were involved in the design, analysis, interpretation or writing of the study. It was not the policy of the involved institutions to include patients in the planning or decision making processes at the time when the study was planned, submitted to ethical committees and funding agencies, and started.

Participant flow
In total, 2440 patients with a PPI medication of at least 6 months were recruited (Fig. 1). The patients were randomized at the practice level to the intervention or control group. After randomization, 53 patients were excluded from data collection because of death (n = 25), consent withdrawal (n = 26) or other reasons (n = 2; dementia, recruitment error). On average, 17 patients per practice were included in 143 practices. The intention-to-treat group comprised 1256 patients in the intervention group and 1131 patients in the control group, a ratio of 1.1 to 1. With regard to data collection, complete DDD data were available for 97.1% of patients at T0, for 95.1% at T1, and for 87.3% at T2, and 84.4% of all patients were interviewed by phone at T1.
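The participant numbers reported above are internally consistent, as the following short check shows. It uses only figures from the text; the variable names are illustrative.

```python
recruited = 2440
exclusions = {"death": 25, "consent withdrawal": 26, "other": 2}
intervention, control = 1256, 1131
practices = 143

excluded = sum(exclusions.values())
analysed = recruited - excluded

print(excluded)                             # 53
print(analysed, intervention + control)     # 2387 2387
print(round(intervention / control, 1))     # 1.1
print(round(analysed / practices))          # 17
```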

Baseline data
Characteristics of GPs (n = 158) were well balanced between the intervention and control group regarding gender, age, practical experience, and practice location (Table 1). However, for general practices (n = 143), some differences appeared. In the intervention group, there was a slight predominance of group practices and fewer single practices, but the effect size Cramér's V signaled a weak association 29. Also, there was a trend towards larger practices in the intervention group compared to more medium-sized practices in the control group, with a moderate effect size Cramér's V. At baseline, the recruited 2387 patients were 64.46 years of age (± 12.94) and 54.4% were female (Table 2). The most prevalent prescribed PPI agents were pantoprazole (64.4%) and omeprazole (29.7%), with an average DDD of 250 (± 8.0; independent of the PPI agent). The most common indications for PPI use were gastroesophageal reflux (41.4%) and gastroprotection as a preventive measure or together with NSAID/ASA (27.5%). More details are shown in Table 2. The characteristics of the intervention and control group were well balanced in terms of gender, age, prescribed PPI agent, defined daily dose (DDD) of PPI and indication for PPI intake. The statistically significant differences that were observed are based predominantly on chi-square tests, which are known to depend on sample size, and they all have negligible effect sizes 29.

Primary outcome
The null model at T0 had an intra-class correlation coefficient of 0.093, meaning that the correlation of cumulated defined daily doses (DDD) of PPI at T0 among patients within the same practice is about this value. Consequently, most of the variation in the outcome is among the lower-level units and the correlation between them is therefore relatively low 39. After inclusion of the variable group as predictor, there was no significant difference between the two groups at T0 regarding the DDD of PPI (p = 0.29). Means for cumulated DDD of PPI at T0 after inclusion of the cluster structure were 256 DDD (SE 7.86; 95% CI 240 to 271) for the intervention group and 244 DDD (SE 8.12; 95% CI 228 to 260) for the control group. At T1, after the introduction of cumulated DDD of PPI at T0 as a covariate, the intra-class correlation coefficient was reduced to 0.042. There was a significant difference between the two groups (p < 0.0001) in favour of the intervention group (Table 3). Means for cumulated DDD of PPI at T1 after considering cumulated DDD of PPI at T0 as a covariate were 199 DDD (SE 5.50; 95% CI 188 to 210) for the intervention group and 236 DDD (SE 5.95; 95% CI 224 to 248) for the control group.
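The reported confidence intervals are close to a normal-approximation (Wald) interval, mean ± 1.96 × SE, as the sketch below illustrates. Whether the authors used the normal or a t-based interval is not stated, so the multiplier of 1.96 is an assumption; small deviations also arise because the reported means are rounded.

```python
def wald_ci(mean, se, z=1.96):
    """Normal-approximation 95% confidence interval for an estimated mean."""
    half_width = z * se
    return mean - half_width, mean + half_width


# Adjusted means and standard errors as reported in the text.
reported = {
    "intervention T0": (256, 7.86),
    "control T0": (244, 8.12),
    "intervention T1": (199, 5.50),
    "control T1": (236, 5.95),
}

for label, (m, se) in reported.items():
    low, high = wald_ci(m, se)
    print(f"{label}: {low:.1f} to {high:.1f}")
```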
Compared to baseline (T0), there was a significant reduction in PPI prescriptions after 6 months (T1) among study patients in the intervention group (reduction of the mean PPI DDD: − 22.3%, 95% CI − 18.55 to − 25.98; see Fig. 2). A reduction in PPI prescription of − 3.3% was observed in the control group (95% CI − 7.18 to + 0.62).
At T2, after the introduction of cumulated defined daily doses (DDD) of PPI at T0 and T1 as covariates, the intra-class correlation coefficient was 0.05. There was no significant difference between the two groups (p = 0.48) (Table 4). Means for cumulated DDD of PPI at T2 after considering cumulated DDD of PPI at T0 and T1 as covariates were 206 DDD (SE 5.71; 95% CI 195 to 218) for the intervention group and 212 DDD (SE 6.26; 95% CI 200 to 225) for the control group. Follow-up at T2 showed that the reduced DDD PPI remained stable in intervention patients with no further significant change compared to T1: + 3.5% (95% CI − 0.99 to + 8.03). Control patients showed a decrease in prescribed DDD PPI of − 10.17% (95% CI − 6.01 to − 14.33) at T2 compared to T1.
The longitudinal random intercepts model confirmed the results of our multilevel ANCOVA models. The time effect was significant (estimate − 27.40, SE 3.36, p < 0.0001), as the number of DDD of PPI decreased over time. The group allocation did not show significance, as there was no significant difference between the intervention and control groups at either T0 or T2 (control, estimate − 1.32, SE 11.09, p = 0.91). The time*group interaction, on the other hand, was significant, as there was a differential trend with a significant difference at time T1 (estimate 15.82, SE 4.90, p = 0.001).
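The percentage changes quoted in this section can be reproduced from the rounded adjusted means reported in the text. That the authors derived them in exactly this way is an assumption; the sketch only shows that the figures are arithmetically consistent.

```python
def percent_change(before, after):
    """Relative change from `before` to `after`, in percent."""
    return (after - before) / before * 100


# Adjusted mean cumulated DDD of PPI as reported in the text.
adjusted_means = {
    "intervention": {"T0": 256, "T1": 199, "T2": 206},
    "control": {"T0": 244, "T1": 236, "T2": 212},
}

for group, m in adjusted_means.items():
    t0_t1 = percent_change(m["T0"], m["T1"])
    t1_t2 = percent_change(m["T1"], m["T2"])
    print(f"{group}: T0->T1 {t0_t1:+.1f}%, T1->T2 {t1_t2:+.1f}%")
# intervention: T0->T1 -22.3%, T1->T2 +3.5%
# control: T0->T1 -3.3%, T1->T2 -10.2%
```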
There were no relevant differences between the models with complete cases and with imputed data.

PPI intake status over time
Figure 3 shows the PPI intake status of patients over time according to the information provided by patients in the telephone interviews at T1. Among the patients in the intervention group who had decided to discontinue PPIs during their arriba-PPI based consultation at T0, 41.9% were still not taking PPIs 6 months later. More than half of these patients had started taking PPIs again. Among participants who had decided on dose reduction during the consultation, 11.4% discontinued their PPI over the course of the study. Only few changes in PPI medication occurred among participants who had decided not to change their PPI medication at T0 and among participants in the control group.

Principal findings
In this study, the use of the discontinuation strategy arriba-PPI in general practice resulted in a greater reduction in PPI prescriptions than usual care. Consultation with arriba-PPI led to discontinuation or reduction attempts in almost half of the patients. More than a third of the patients who had decided to discontinue PPI were still taking no PPI at 6 months. Overall, DDD PPI prescription rates in all intervention patients were significantly decreased by 22% at 6 months, and that level was maintained at 12 months.

Discussion of the literature
Interventions that aim to change provider behavior to reach deprescribing of unnecessary PPI medication show similar outcomes. Lai et al. 18 and Cateau et al. 40 describe education sessions for physicians that range from 1 h to half a day and show significant deprescribing results (47% success rate at 4 months and 13% at 1 year, respectively). These studies and ours have an educational component in common. The physician has to give the impulse for a deprescribing conversation that also reflects his own beliefs, to be believable and trustworthy for the patient 41. Publication of international guidelines is not sufficient and shows no or only a slight difference in prescribing patterns 17,42. Rigid methods like a deprescribing algorithm or installing an Excel file with an embedded, automated scoring system show significant but lower success 19,20. Studies that aim at incorporating patients' views and addressing their fears of poor symptomatic control, offer a self-management plan or provide measures for rebound problems report 75%-83% stepping down or off PPI 43-45. It should be noted that for these studies investigators prescreened their participants for inappropriate PPI use, whereas in our study we included all patients with long-term PPI use. arriba-PPI offers the possibility to decide whether a PPI is inappropriate. It seems crucial to include the patient in these deprescribing conversations to reach a common understanding 41. Important topics to discuss are the necessity of PPI medication, the possibility of rebound symptoms, and symptom avoidance and control. It is essential for patients to understand why a PPI could be discontinued and for physicians to acknowledge their concern about recurring symptoms. Patient preferences are not always incorporated into PPI deprescribing decisions 46,47. However, some patients prefer to participate in a shared decision about deprescribing, including discussing their preferences, while other, mostly older patients trust their physician to decide for them 48. Discontinuation studies that deprescribe PPI without consultation or shared decision making, e.g. when there is no indication for PPI use after endoscopy, are also successful (27% or 15% off PPI after 1 year, without symptoms), although patients with troublesome GERD presumably did not participate in these studies 49,50.

How did the intervention work?
Apart from the software, our complex intervention included outreach visits (or "academic detailing", AD 51) with study team members visiting intervention group practices to deliver evidence-based information and educational content about harmful effects of long-term PPI use and deprescribing. Chhina et al. 52 showed in their review that AD can be effective at optimizing prescription of medications by GPs, albeit with an overall moderate effect. The software arriba-PPI adds to the educational component by scaffolding new behavior and reinforcing the self-efficacy of the GPs regarding PPI withdrawal. On the one hand, clinicians and policymakers adopting decision support software should be aware that additional measures are required for implementation. On the other hand, implementing software can have effects beyond the use of the software itself. We report further insights from qualitative interviews with GPs and patients, including a programme theory, in a separate publication 53.

Barriers
In our study, most patients stated complaints or gastroprotection, and fewer a self-reported diagnosis, as the reason for their PPI use at T1. This reflects the importance of symptomatic control. That makes PPI "notoriously hard to reduce" because they offer symptomatic relief in comparison to other PIMs like statins 40. Some GPs expect the stereotype of a patient demanding PPI prescriptions or using PPI to support an unhealthy lifestyle 15. However, many patients actually use PPI to maintain a healthier lifestyle, like eating fruit or drinking less alcohol. Patients are also more concerned about side effects and the safety of PPIs than their doctors realize. Other studies show that many patients (40-70%) are open to discussing deprescribing and would like to take the lowest effective dose 16. Patients understand the rationale for deprescribing and appreciate receiving specific advice on a deprescribing plan. The biggest concern is the possibility of symptoms returning: 68% of patients do not accept the return of any symptoms, even minor ones, but patients are encouraged if they know they can restart their PPI medication if necessary. Prolonged reflux symptoms on PPI therapy are associated with reduced physical and mental quality of life 54.
Gastroprotection with PPI during NSAID therapy may become relevant in patients older than 60 years and with risk factors 55. Even in those groups, the risk of peptic ulcer disease (PUD) is not sufficiently high to warrant gastroprotection for everyone. Patients starting long-term NSAID or ASA therapy should first be tested for H. pylori so that eradication treatment can be started; this will sufficiently reduce PUD complications 55. Additional risk factors can be determined by a scoring system (Table 1 in ref. 55).

Strengths and limitations of the study
This study had numerous strengths. To counter selection bias at recruitment, patients were asked systematically and consecutively in the general practice when picking up their refill prescription for PPI. Medical assistants invited them to participate in a study to discuss the status of their stomach medication with their physician; a possible discontinuation was not yet addressed at this point.
By selecting three study sites across federal state lines, the results were not localized and possible differences in PPI prescription patterns between states were compensated. Furthermore, the German health insurance company BARMER identified high prescribing practices in Hessen and Westphalia and invited the top 13.8% of the PPI prescribing practices to participate in our study. At least 10 out of 143 study practices were high prescribers. The sample of practices was thus highly generalizable. Although this was a cluster-randomized study, relevant characteristics of participating patients and practices were well balanced between study arms.
For studies of this kind, practices interested in the topic are easier to motivate to take part. Since these practices have often already reflected on their prescribing behavior, ceiling effects may result. Despite this, the implementation of arriba-PPI was effective at T1.
Between the 6th and 12th month of data collection, there was also a decrease in PPI prescriptions in the control group, so that the difference between the intervention and control group was no longer significant at the end of the data collection period. Measures were introduced in the health care system to reduce PPI prescribing. These included not only information on risks and costs but also the threat of monetary sanctions for GPs who prescribed inappropriately. This could have resulted in more conscious PPI prescribing.
We also cannot exclude a certain degree of contamination. The patient interview at 6 months may have influenced the control group's attitude toward PPI use. As there was no consultation planned in this group, the phone interview was the only conversation for these patients concerning their medication. There could also be a certain degree of social desirability bias when patients were directly asked about their PPI intake. Lastly, interviewers were not blinded.

Clinical impact and future research
arriba-PPI is an effective tool that can also be used as a source of personal training. Even when it is not applied in daily consultations, the physician is attuned to PPI overuse and can give his patients an impulse to change their way of thinking. As most of the conversations with patients are centered on planning and follow-up, information and training should happen beforehand 56. Patient-specific deprescribing interventions lead to less medication

13:21633 | https://doi.org/10.1038/s41598-023-48839-2

Figure 2.
Course of the adjusted means (adjusted at T1 and T2 by the ANCOVA models) of the DDD of PPI in the intervention and control groups (average DDD with standard errors).

Table 1.
Characteristics of general practices. a In some practices, more than one physician is recruited per practice (single-handed practices are still common in Germany). b Practice size is determined by health insurance certificates per quarter; small: < 900 certificates, medium: 900-1500 certificates, large: > 1500 certificates.

Table 3.
Results for the multilevel model at T1 with cumulated defined daily doses (DDD) of PPI at T0 as a covariate (ANCOVA model).

Table 4.
Results for the multilevel model at T2 with cumulated defined daily doses (DDD) of PPI at T0 and T1 as covariates (ANCOVA model).